64 research outputs found

    Corpus based classification of text in Australian contracts

    Written contracts are a fundamental framework for commercial and cooperative transactions and relationships. Limited research has been published on the application of machine learning and natural language processing (NLP) to contracts. In this paper we report the classification of components of contract texts using machine learning and hand-coded methods. Authors studying a range of domains have found that combining machine learning with rule-based approaches increases the accuracy of machine learning. We find similar results, which suggest the utility of leveraging hand-coded classification rules for machine learning. We attained an average accuracy of 83.48% on a multiclass labelling task on 20 contracts by combining machine learning and rule-based approaches, increasing performance over machine learning alone.

    A Right to Access Implies A Right to Know: An Open Online Platform for Research on the Readability of Law

    The widespread availability of legal materials online has opened the law to a new and greatly expanded readership. These new readers need the law to be readable by them when they encounter it. However, the available empirical research supports a conclusion that legislation is difficult to read, if not incomprehensible, to most citizens. We review approaches that have been used to measure the readability of text, including readability metrics, cloze testing and the application of machine learning. We report the creation and testing of an open online platform for readability research. This platform is made available to researchers interested in undertaking research on the readability of legal materials. To demonstrate the capabilities of the platform, we report its initial application to a corpus of legislation. Linguistic characteristics are extracted using the platform and then used as input features for machine learning using the Weka package. Wide divergences are found between sentences in a corpus of legislation and those in a corpus of graded reading material or in the Brown corpus (a balanced corpus of English written genres). Readability metrics are found to be of little value in classifying sentences by grade reading level (noting that such metrics were not designed to be used with isolated sentences).
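As a rough illustration of what a readability metric computes, the sketch below implements the Flesch-Kincaid grade level in Python. The abstract does not say which metrics the platform uses, so the choice of metric is an assumption, and the function names `count_syllables` and `flesch_kincaid_grade` are hypothetical. The syllable counter is a crude vowel-run heuristic, not a dictionary lookup.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per run of vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    simple = "The cat sat."
    legal = "Notwithstanding any other provision, the administrator shall promulgate regulations."
    print(flesch_kincaid_grade(simple))
    print(flesch_kincaid_grade(legal))
```

The formula depends only on average sentence length and average syllables per word, so a single isolated sentence gives it very little to work with. That fits the abstract's caveat that such metrics were not designed for isolated sentences.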

    Fast On-line Statistical Learning on a GPGPU

    On-line machine learning using stochastic gradient descent is an inherently sequential computation. This makes it difficult to improve performance by simply employing parallel architectures. Langford et al. made a modification to the standard stochastic gradient descent approach which opens up the possibility of parallel computation. They also proved that there is no significant loss in accuracy in their approach. They empirically demonstrated the gain in speed for the case of a pipelined architecture with a few processing units. In this paper we report on applying the Langford et al. approach on a General Purpose Graphics Processing Unit (GPGPU) with a large number of processing units. We accelerate the learning speed by approximately 4.5 times compared to a standard single-threaded approach with comparable accuracy. We also evaluate the GPU performance for the sequential variant of the algorithm, which has not previously been reported. Finally, we investigate how changes in the number of threads, number of blocks, and amount of delay affect the overall performance and accuracy.
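The role of delay can be sketched with a small sequential simulation in Python. It assumes the modification is a delayed update, in which each step applies a gradient computed a fixed number of steps earlier. This is not the paper's GPU implementation: the least-squares objective, the learning rate, and the names `delayed_sgd` and `make_data` are all illustrative assumptions.

```python
from collections import deque
import random


def delayed_sgd(data, dim, delay=0, lr=0.05, epochs=20):
    """Least-squares SGD in which each update uses a gradient `delay` steps old.

    delay=0 reduces to ordinary sequential SGD.
    """
    w = [0.0] * dim
    pending = deque()  # gradients computed but not yet applied
    for _ in range(epochs):
        for x, y in data:
            # Gradient of 0.5 * (w.x - y)^2, evaluated at the current weights.
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            pending.append([err * xi for xi in x])
            # Once more than `delay` gradients are queued, apply the oldest one.
            if len(pending) > delay:
                g = pending.popleft()
                w = [wi - lr * gi for wi, gi in zip(w, g)]
    # The last `delay` gradients are left unapplied in this sketch.
    return w


def make_data(n=200, seed=0):
    """Noiseless synthetic regression data (hypothetical, for illustration only)."""
    rng = random.Random(seed)
    true_w = [2.0, -1.0]
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, sum(t * xi for t, xi in zip(true_w, x))))
    return data, true_w


if __name__ == "__main__":
    data, true_w = make_data()
    print("delay=0:", delayed_sgd(data, dim=2, delay=0))
    print("delay=4:", delayed_sgd(data, dim=2, delay=4))
```

Because a delayed update does not need the newest weights, several workers can compute gradients concurrently. The simulation only shows that a modest delay can still converge on an easy problem. It says nothing about the speed-up reported in the abstract.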

    Citizen Science for Citizen Access to Law

    This paper sits at the intersection of citizen access to law, legal informatics and plain language. The paper reports the results of a joint project of the Cornell University Legal Information Institute and the Australian National University which collected thousands of crowdsourced assessments of the readability of law through the Cornell LII site. The aim of the project is to enhance accuracy in the prediction of the readability of legal sentences. The study asked readers on legislative pages of the LII site to rate passages from the United States Code, the Code of Federal Regulations and other texts for readability and other characteristics. The research provides insight into who uses legal rules and how they do so. The study enables conclusions to be drawn as to the current readability of law and the spread of readability among legal rules. The research is intended to enable the creation of a dataset of legal rules labelled by human judges as to readability. Such a dataset, in combination with machine learning, will assist in identifying factors in legal language which impede readability and access for citizens. As far as we are aware, this research is the largest ever study of readability and usability of legal language and the first research which has applied crowdsourcing to such an investigation. The research is an example of the possibilities open for enhancing access to law through engagement of end users in the online legal publishing environment and through collaboration between legal publishers and researchers.

    Randomized controlled trial of a coordinated care intervention to improve risk factor control after stroke or transient ischemic attack in the safety net: Secondary stroke prevention by Uniting Community and Chronic care model teams Early to End Disparities (SUCCEED).

    Background: Recurrent strokes are preventable through awareness and control of risk factors such as hypertension, and through lifestyle changes such as healthier diets, greater physical activity, and smoking cessation. However, vascular risk factor control is frequently poor among stroke survivors, particularly among socio-economically disadvantaged blacks, Latinos and other people of color. The Chronic Care Model (CCM) is an effective framework for multi-component interventions aimed at improving care processes and outcomes for individuals with chronic disease. In addition, community health workers (CHWs) have played an integral role in reducing health disparities; however, their effectiveness in reducing vascular risk among stroke survivors remains unknown. Our objectives are to develop, test, and assess the economic value of a CCM-based intervention using an Advanced Practice Clinician (APC)-CHW team to improve risk factor control after stroke in an under-resourced, racially/ethnically diverse population. Methods/design: In this single-blind randomized controlled trial, 516 adults (≥40 years) with an ischemic stroke, transient ischemic attack or intracerebral hemorrhage within the prior 90 days are being enrolled at five sites within the Los Angeles County safety-net setting and randomized 1:1 to intervention vs usual care. Participants are excluded if they do not speak English, Spanish, Cantonese, Mandarin, or Korean or if they are unable to consent. The intervention includes a minimum of three clinic visits in the healthcare setting, three home visits, and Chronic Disease Self-Management Program group workshops in community venues. The primary outcome is blood pressure (BP) control (systolic BP <130 mmHg) at 1 year. Secondary outcomes include: (1) mean change in systolic BP; (2) control of other vascular risk factors including lipids and hemoglobin A1c; (3) inflammation (C reactive protein [CRP]); (4) medication adherence; (5) lifestyle factors (smoking, diet, and physical activity); (6) estimated relative reduction in risk for recurrent stroke or myocardial infarction (MI); and (7) cost-effectiveness of the intervention versus usual care. Discussion: If this multi-component interdisciplinary intervention is shown to be effective in improving risk factor control after stroke, it may serve as a model that can be used internationally to reduce race/ethnic and socioeconomic disparities in stroke in resource-constrained settings. Trial registration: ClinicalTrials.gov Identifier NCT01763203

    Effect of angiotensin-converting enzyme inhibitor and angiotensin receptor blocker initiation on organ support-free days in patients hospitalized with COVID-19

    IMPORTANCE Overactivation of the renin-angiotensin system (RAS) may contribute to poor clinical outcomes in patients with COVID-19. OBJECTIVE To determine whether angiotensin-converting enzyme (ACE) inhibitor or angiotensin receptor blocker (ARB) initiation improves outcomes in patients hospitalized for COVID-19. DESIGN, SETTING, AND PARTICIPANTS In an ongoing, adaptive platform randomized clinical trial, 721 critically ill and 58 non–critically ill hospitalized adults were randomized to receive an RAS inhibitor or control between March 16, 2021, and February 25, 2022, at 69 sites in 7 countries (final follow-up on June 1, 2022). INTERVENTIONS Patients were randomized to receive open-label initiation of an ACE inhibitor (n = 257), ARB (n = 248), ARB in combination with DMX-200 (a chemokine receptor-2 inhibitor; n = 10), or no RAS inhibitor (control; n = 264) for up to 10 days. MAIN OUTCOMES AND MEASURES The primary outcome was organ support–free days, a composite of hospital survival and days alive without cardiovascular or respiratory organ support through 21 days. The primary analysis was a bayesian cumulative logistic model. Odds ratios (ORs) greater than 1 represent improved outcomes. RESULTS On February 25, 2022, enrollment was discontinued due to safety concerns. Among 679 critically ill patients with available primary outcome data, the median age was 56 years and 239 participants (35.2%) were women. Median (IQR) organ support–free days among critically ill patients was 10 (–1 to 16) in the ACE inhibitor group (n = 231), 8 (–1 to 17) in the ARB group (n = 217), and 12 (0 to 17) in the control group (n = 231) (median adjusted odds ratios of 0.77 [95% bayesian credible interval, 0.58-1.06] for improvement for ACE inhibitor and 0.76 [95% credible interval, 0.56-1.05] for ARB compared with control). The posterior probabilities that ACE inhibitors and ARBs worsened organ support–free days compared with control were 94.9% and 95.4%, respectively. Hospital survival occurred in 166 of 231 critically ill participants (71.9%) in the ACE inhibitor group, 152 of 217 (70.0%) in the ARB group, and 182 of 231 (78.8%) in the control group (posterior probabilities that ACE inhibitor and ARB worsened hospital survival compared with control were 95.3% and 98.1%, respectively). CONCLUSIONS AND RELEVANCE In this trial, among critically ill adults with COVID-19, initiation of an ACE inhibitor or ARB did not improve, and likely worsened, clinical outcomes. TRIAL REGISTRATION ClinicalTrials.gov Identifier: NCT0273570

    Induction in first order logic from noisy training examples and fixed example set size

    This dissertation investigates the field of inductive logic programming (ILP) and, in so doing, designs and develops an ILP system, Lime. Lime addresses the problem of noisy training examples; learning from only positive, only negative, or both positive and negative examples; efficiently biasing and searching the hypothesis space; and handling recursion efficiently and effectively. The Q-heuristic is introduced to address the problem of learning with both noisy training examples and fixed numbers of positive and negative training examples. This heuristic is based on Bayes' rule. Both a justification of its derivation and a description of the context in which it is appropriately applied are given. Because of the general nature of this heuristic, its application is not restricted to ILP. Instead of employing a greedy covering approach to constructing clauses, Lime employs the Q-heuristic to evaluate entire logic programs as hypotheses. To tame the inevitable explosion in the search space, the notion of a simple clause is introduced. These sets of literals may be viewed as subparts of clauses that are effectively independent in terms of variables used. Instead of growing a clause one literal at a time, Lime efficiently combines simple clauses to construct a set of gainful candidate clauses. Subsets of these candidate clauses are evaluated using the Q-heuristic to find the final hypothesis. Details of the algorithms and data structures of Lime are discussed. Lime's handling of recursive logic programs is also described. Experimental results are provided to illustrate how Lime achieves its design goals of better noise handling, learning from a fixed set of examples (e.g., from only positive data), and learning recursive logic programs. These results compare the performance of Lime with other leading ILP systems, such as Foil and Progol, in a variety of domains. Empirical results with a boosted version of Lime are also reported.

    Partial Matching of Planar Polygons Under Translation and Rotation
